AITopics | heterogeneous cluster

High-Throughput LLM inference on Heterogeneous Clusters

Xiong, Yi, Huang, Jinqi, Huang, Wenjie, Yu, Xuebing, Li, Entong, Ning, Zhixiong, Zhou, Jinhua, Zeng, Li, Chen, Xin

arXiv.org Artificial Intelligence, Apr-23-2025

Nowadays, many companies possess various types of AI accelerators, forming heterogeneous clusters. Efficiently leveraging these clusters for high-throughput large language model (LLM) inference services can significantly reduce costs and expedite task processing. However, LLM inference on heterogeneous clusters presents two main challenges. First, different deployment configurations can result in vastly different performance. The number of possible configurations is large, and evaluating the effectiveness of a specific setup is complex, so finding an optimal configuration is not an easy task. Second, LLM inference instances within a heterogeneous cluster have varying processing capacities and therefore handle inference requests at different speeds. Evaluating these capacities and designing a request scheduling algorithm that fully exploits the potential of each instance is challenging. In this paper, we propose a high-throughput inference service system on heterogeneous clusters. First, the deployment configuration is optimized by modeling the resource amount and expected throughput and searching the configuration space exhaustively. Second, a novel mechanism is proposed to schedule requests among instances, which fully considers the different processing capabilities of the various instances. Extensive experiments show that the proposed scheduler improves throughput by 122.5% and 33.6% on two heterogeneous clusters, respectively.
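To make the idea of capacity-aware request scheduling concrete, here is a minimal Python sketch. It is not the scheduler proposed in the paper: the `Instance` class, the `capacity` field (an assumed relative throughput), and the "lowest pending work per unit of capacity" rule are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    capacity: float   # assumed relative throughput (hypothetical unit: requests/s)
    pending: int = 0  # requests currently assigned to this instance


def pick_instance(instances):
    """Return the instance whose queue would be shortest relative to its capacity."""
    return min(instances, key=lambda inst: (inst.pending + 1) / inst.capacity)


def dispatch(instances, num_requests):
    """Assign requests one at a time and return the per-instance request counts."""
    for _ in range(num_requests):
        pick_instance(instances).pending += 1
    return {inst.name: inst.pending for inst in instances}


if __name__ == "__main__":
    cluster = [Instance("fast", capacity=3.0), Instance("slow", capacity=1.0)]
    print(dispatch(cluster, 8))
```

With one instance three times as fast as the other, the rule sends it roughly three times as many requests, whereas a round-robin policy would split them evenly and leave the faster instance idle part of the time.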

large language model, machine learning, throughput

arXiv: 2504.15303

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dataset Distillation-based Hybrid Federated Learning on Non-IID Data

Shi, Xiufang, Zhang, Wei, Wu, Mincheng, Liu, Guangyi, Wen, Zhenyu, He, Shibo, Shah, Tejal, Ranjan, Rajiv

arXiv.org Artificial Intelligence, Sep-25-2024

In federated learning, the heterogeneity of client data has a great impact on the performance of model training. Many heterogeneity issues in this process arise from data that are not independent and identically distributed (Non-IID). This study focuses on the issue of label distribution skew. To address it, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distillation to generate approximately independent and identically distributed (IID) data, thereby improving the performance of model training. In particular, we partition the clients into heterogeneous clusters, where the data labels among different clients within a cluster are unbalanced while the data labels among different clusters are balanced. The cluster headers collect distilled data from the corresponding cluster members and conduct model training in collaboration with the server. This training process resembles traditional federated learning on IID data and hence effectively alleviates the impact of Non-IID data on model training. Furthermore, we compare our proposed method with typical baseline methods on public datasets. Experimental results demonstrate that when the data labels are severely imbalanced, the proposed HFLDD outperforms the baseline methods in terms of both test accuracy and communication cost.
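The clustering step described above, grouping label-skewed clients so that each cluster's pooled labels are roughly balanced, can be sketched as follows. This is a simple greedy heuristic written for illustration, not the partitioning procedure used by HFLDD; the function names and the imbalance measure (most frequent minus least frequent label count) are assumptions.

```python
def imbalance(counts):
    """Spread between the most and least frequent label in a cluster."""
    return max(counts) - min(counts)


def partition_clients(client_label_counts, num_clusters):
    """Greedily assign clients to clusters so each cluster's pooled labels stay balanced.

    client_label_counts maps a client id to a list of per-label sample counts.
    """
    num_labels = len(next(iter(client_label_counts.values())))
    clusters = [{"members": [], "counts": [0] * num_labels} for _ in range(num_clusters)]
    # Place the largest clients first; ties keep insertion order.
    order = sorted(client_label_counts, key=lambda c: -sum(client_label_counts[c]))
    for client in order:
        counts = client_label_counts[client]

        def cost(cluster):
            merged = [a + b for a, b in zip(cluster["counts"], counts)]
            # Prefer the lowest imbalance, then the smallest cluster.
            return (imbalance(merged), len(cluster["members"]))

        best = min(clusters, key=cost)
        best["members"].append(client)
        best["counts"] = [a + b for a, b in zip(best["counts"], counts)]
    return clusters


if __name__ == "__main__":
    clients = {"a": [10, 0], "b": [0, 10], "c": [10, 0], "d": [0, 10]}
    for cluster in partition_clients(clients, 2):
        print(cluster["members"], cluster["counts"])
```

Each client here holds a single label, yet every resulting cluster holds both labels in equal amounts, which is the within-cluster heterogeneity and between-cluster balance the abstract describes.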

dataset, heterogeneous cluster, hfldd

arXiv: 2409.17517

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization

Zhao, Juntao, Wan, Borui, Peng, Yanghua, Lin, Haibin, Wu, Chuan

arXiv.org Artificial Intelligence, Mar-2-2024

Recent breakthroughs in large-scale language models (LLMs) have demonstrated impressive performance on various tasks. The immense sizes of LLMs have led to very high resource demand and cost for running the models. Though the models are largely served using uniform high-caliber GPUs nowadays, utilizing a heterogeneous cluster with a mix of available high- and low-capacity GPUs could substantially reduce the serving cost. Designs that support efficient LLM serving on a heterogeneous cluster are lacking; current solutions focus on model partition and uniform compression among homogeneous devices. This paper proposes LLM-PQ, a system that advocates adaptive model quantization and phase-aware partition to improve LLM serving efficiency on heterogeneous GPU clusters. We carefully decide on mixed-precision model quantization together with phase-aware model partition and micro-batch sizing in distributed LLM serving with an efficient algorithm, to greatly enhance inference throughput while fulfilling user-specified model quality targets. Extensive experiments on production inference workloads in 11 different clusters demonstrate that LLM-PQ achieves up to 2.88x (2.26x on average) throughput improvement in inference, showing great advantages over state-of-the-art works.
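The intuition behind mixing precisions across unequal GPUs can be shown with a toy calculation. The sketch below is not LLM-PQ's algorithm: it takes a fixed layer partition as given and, for each device, picks the highest weight precision that fits in memory. The function name, the candidate bit-widths, and the choice to count only weight memory (ignoring activations and the KV cache) are simplifying assumptions.

```python
def choose_bitwidths(layers_per_device, params_per_layer, memory_bytes, candidates=(16, 8, 4)):
    """Pick, per device, the highest precision whose weights fit in that device's memory.

    Only weight storage is counted; activations and KV cache are ignored here.
    """
    chosen = []
    for n_layers, mem in zip(layers_per_device, memory_bytes):
        for bits in candidates:  # candidates are ordered from highest to lowest precision
            weight_bytes = n_layers * params_per_layer * bits // 8
            if weight_bytes <= mem:
                chosen.append(bits)
                break
        else:
            raise ValueError("no candidate bit-width fits on this device")
    return chosen


if __name__ == "__main__":
    # Two devices, four layers each, one billion parameters per layer (made-up numbers).
    print(choose_bitwidths([4, 4], 1_000_000_000, [8_000_000_000, 4_000_000_000]))
```

A real system would also have to decide the partition itself and weigh the quality loss of lower precision against the throughput gain, which is the joint decision the abstract refers to.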

gpus, llm-pq, quantization

arXiv: 2403.01136

Country:

Asia > China > Hong Kong (0.05)
North America > United States (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Fair Oversampling Technique using Heterogeneous Clusters

Sonoda, Ryosuke

arXiv.org Artificial Intelligence, May-23-2023

Class imbalance and group (e.g., race, gender, and age) imbalance are acknowledged as two data-related factors that hinder the trade-off between fairness and utility of machine learning classifiers. Existing work addresses class imbalance and group imbalance jointly through fair oversampling techniques. Unlike common oversampling techniques, which only address class imbalance, fair oversampling techniques significantly improve the abovementioned trade-off, as they can also address group imbalance. However, if the original clusters are too small, these techniques may cause classifier overfitting. To address this problem, we herein develop a fair oversampling technique using data from heterogeneous clusters. The proposed technique generates synthetic data that have class-mix features or group-mix features to make classifiers robust to overfitting. Moreover, we develop an interpolation method that can enhance the validity of generated synthetic data by considering the original cluster distribution and data noise. Finally, we conduct experiments on five realistic datasets and three classifiers, and the experimental results demonstrate the effectiveness of the proposed technique in terms of fairness and utility.
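Interpolation between samples drawn from different clusters, the general mechanism the abstract refers to, can be sketched as below. This is a generic SMOTE-style interpolation written for illustration, not the interpolation method proposed in the paper; the function names and the cap of 0.5 on the mixing weight are assumptions.

```python
import random


def interpolate(x, y, lam):
    """Return the point x + lam * (y - x) on the segment between x and y."""
    return [a + lam * (b - a) for a, b in zip(x, y)]


def oversample(minority, other_cluster, n_new, rng=None):
    """Create n_new synthetic points between minority samples and another cluster.

    lam is capped at 0.5 so each synthetic point stays closer to its minority
    base sample; the cap is an illustrative choice, not taken from the paper.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        partner = rng.choice(other_cluster)
        lam = rng.uniform(0.0, 0.5)
        synthetic.append(interpolate(base, partner, lam))
    return synthetic


if __name__ == "__main__":
    minority = [[0.0, 0.0], [0.2, 0.1]]
    other = [[2.0, 4.0], [1.8, 3.9]]
    for point in oversample(minority, other, 3):
        print(point)
```

Because the partner comes from a different cluster, each synthetic point carries mixed features, in contrast to plain oversampling, which interpolates only within one small cluster.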

artificial intelligence, classifier, machine learning

doi: 10.1016/j.ins.2023.119059

arXiv: 2305.13875

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Ireland (0.04)
Europe > Greece > Attica > Athens (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Enabling fairer data clusters for machine learning

#artificialintelligence, Aug-10-2020, 00:25:51 GMT

Research published recently by CSE investigators can make training machine learning (ML) models fairer and faster. With a tool called AlloX, Prof. Mosharaf Chowdhury and a team from Stony Brook University developed a new way to fairly schedule high volumes of ML jobs in data centers that make use of multiple different types of computing hardware, like CPUs, GPUs, and specialized accelerators. As these so-called heterogeneous clusters grow to be the norm, fair scheduling systems like AlloX will become essential to their efficient operation. This project is a new step for Chowdhury's lab, which has recently published a number of tools aimed at speeding up the process of training and testing ML models. Their past projects Tiresias and Salus sped up GPU resource sharing at multiple scales: both within a single GPU (Salus) and across many GPUs in a cluster (Tiresias).

artificial intelligence, hardware, machine learning

Country: North America > United States > New York > Suffolk County > Stony Brook (0.26)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.44)

Enabling fairer data clusters for machine learning

#artificialintelligence, Jul-21-2020, 06:20:17 GMT

artificial intelligence, chowdhury, machine learning

Country: North America > United States > New York > Suffolk County > Stony Brook (0.25)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.43)
